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Abstract 

This paper addresses the problem of computing the Cramer-Rao bound for the parameters of 
a Markov random field. This bound depends on the derivatives of a likelihood distribution that is 
generally intractable. It is established that by exploiting a property of the exponential family, this 
intractable bound can be related to the statistical moments of the Gibbs potential of the Markov 
random field. A derivative-free Monte Carlo algorithm is then proposed to estimate this moments and 
compute the bound. To illustrate the interest of this method, the proposed algorithm is successfully 
applied to the Ising and Potts models. The resulting bounds are used to assess the performance of 
three state-of-the art estimators of the parameter of these Markov random fields. 

Index Terms 

Cramer Rao bound, Monte Carlo algorithms, Markov random fields, Intractable distributions. 

I. Introduction 

The estimation of parameters involved in intractable statistical models (i.e., with intractable 
likelihood distributions) is a difficult problem that has received significant attention in the 
computational statistics and signal processing literature [Q]|, El, 0. Particularly, estimating 
the parameters of a Markov random field (MRF) is an active research topic in image pro- 
cessing [U, 0, H, 0. Several new estimators have been recently derived, mainly based 
on efficient Monte Carlo or variational approximations JT), ED, 0, 0, 0. 
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This paper addresses the problem of deriving the Cramer-Rao bound (CRB) IfTOll for 
estimators of MRF parameters. The CRB establishes a theoretical lower limit on how much 
information a set of observations carries about unknown parameters. Specifically, it defines 
the minimum variance for any unbiased estimator of these parameters. The CRB is practically 
used as a means to characterize the performance of unbiased estimators in terms of mean 
square error (i.e., estimation variance). Unfortunately, the CRB for most MRF models is 
difficult to compute because their probability density functions (pdf) are intractable [TTTl Ch. 
7] . One could think of exploiting the estimation techniques referenced above to first compute 
an approximate pdf and then evaluate the CRB numerically. However, this would lead to 
a poor approximation because the numerical operations involved in the computation of the 
CRB are ill-conditioned. 

This paper proposes to exploit an interesting property of the exponential family to compute 
the CRB of MRF parameters directly. Precisely, it is shown that it is possible to express this 
CRB in terms of second order statistical moments. An original Monte Carlo (MC) algorithm 
is then proposed to estimate these moments and compute the CRB. The proposed CRB 
estimation method is illustrated on specific MRF models that have been widely used in the 
image processing community, namely the Potts and Ising models. Indeed, these models have 
been successfully used to capture spatial correlations between neighboring pixels in several 
segmentation and/or classification problems Ifl2ll . |fT3l , Ifl4l . The remainder of the paper 
is organized as follows: The class of distributions considered in this work is introduced in 
Section II. Section III. A shows that for these distributions the CRB of the unknown parameters 
can be expressed as a covariance measure. The MC algorithm proposed to estimate this 
covariance is presented in Section III.B. The application of the proposed method to the Ising 
and Potts models is presented in section IV. Conclusions are finally reported in Section V. 

II. Problem Statement 

Let 6 = (#!,..., 6 M ) be an unknown parameter vector and z = (zi, z 2 , . . . , z N ) an 
observation vector whose elements take their values in a set Vt. This paper considers the 
case where 6 and z are related by the following generic pdf 

f(z\e)^-^-ex P [e^(z)] (i) 
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where $(2) : Q N — > M A/ is a sufficient statistic and C(0) is the normalizing constant given 
by 

C(0) = [ exp [0$(z)]dz. (2) 
Jn N 

Note that the model ([[]) defines a subclass of the exponential family. It comprises several 
standard distributions, such as Gaussian or Poisson distributions, as well as multivariate 
distributions frequently used in signal and image processing applications, such as Markov 
random fields [[TTIl . In this latter case, the normalizing constant, often referred to as partition 
function [15 J is generally intractable. 

The Cramer-Rao bound of establishes a lower bound on the covariance matrix of any 
unbiased estimator of [10J. Because the existence of the bound requires that f(z\0) 
verifies some weak regularity conditions, we will assume that C(0) is twice continuously 
differentiable (i.e., C(0) E C 2 ). Then the CRB is equal to the inverse of the Fisher information 
matrix (FIM) of OH, i.e., 

cov(0-0) > [X(0)]~ l (3) 



(4) 



where 1(0) is an M x M positive definite matrix whose element (i,j) is given by flTO 

d 2 



dOiOj 



log[f(*|*)] 



and where E z denotes the expectation operator with respect to z. By applying the definition 
@ to ([!} it can be shown that 



Ti,j(0) = E 2 
d 2 



\og[C(0)]. 



(5) 



Unfortunately, evaluating ([5]) for MRF models is rarely possible because of the intractability 
of C(0). One could think of approximating \og[C (0)} by using the following two-step 
procedure: i) first estimate log[C(0)] over a grid of points using an MC scheme akin to 
0, ii) apply a crude numerical differentiation method. However, numerical differentiation is 
ill-conditioned (i.e., amplifies estimation errors), may be difficult to calibrate and introduces 
a discretization error that is hard to evaluate. 

The next section proposes an alternative derivation of the Fisher information matrix that 
avoids any differentiation of log[C(0)]. An MC algorithm is then proposed to estimate the 
FIM of ([5]), without estimating C(0) or computing derivatives. 
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III. Computing the Fisher Information Matrix 1(0) 

This section proposes a derivative-free MC algorithm to estimate 1(0). The proposed 
method is based on the fact that the distributions whose pdf's are defined in ([T|) belong to 
the exponential family. 

A. Derivative -free estimation of 1(0) 

For distributions belonging to the exponential family, the derivatives of the log-partition 
function can be directly related to the statistical moments of their sufficient statistic. 

Proposition 1. The Fisher information matrix of distributions with pdf ([T]) is equal to the 
covariance matrix of their sufficient statistic, i.e., 1(0) = cov [&(z)]. 

Proof: Let z be a vector distributed according to ([T]) and &i(z) : Cl N — > E the ith 
component of the vector $(z) = [$i(z), . . . , §m(z)} . Then, the first and second order 
statistical moments of ®i(z) are given by 

E»[*,(z)] = / nK *,(z)53||M, z (6) 



and 



E z [*,(*)*,(*)] = / ^(z)^(z) ^^ z)] dz. (7) 



1iM = -^V 1 7^77^ (8) 



By developing (Q, we obtain 

C^i)(0) C®(0)C^(0) 
C(0) C 2 (0 

where C(0) has been defined in @ and where C®(0) and C^(0) are given by 

C«(0) 4 ^-C(0) = [ ^(z) exp [0${z)]dz 

= C(0)E z [^(z)] 

G{i ' j){e) ~ Ww c{9) = L [OHz)]dz 

= C(0)E z [Mz)<l> 3 (z)]. 
Finally, by substituting (|9]) and ( flQ] ) into ([8]), the following result is obtained 

l id (0) =E Z - E z [$i(z)] E z 

=cov 



(9) 



(10) 



(11) 
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Note that this alternative expression for Z, ; j does not involve derivatives. Unfortunately 
( [TT] ) is still generally intractable and cannot be evaluated exactly. However, unlike ([5]), the 



statistical moments cov [$>i(z), $j(z)] in ( fTT| ) can be efficiently estimated by MC integration 
OH Chap. 3]. 



B. Proposed Monte Carlo method 

Based on Proposition [Tj we present a MC method to estimate cov [&i(z), $j(z)] in ( [TT] ) 



Precisely, we present a MC algorithm to generate samples {<p^}p =l that are distributed 
according to f($(z)\/3) (see Algo. [TJ below). The generated samples are then used to ap- 
proximate the statistical moments cov [^(z), &j(z)). Once the samples have been generated, 
the elements of the FIM can be approximated using the sample covariance 

^)^E(^-^)(^-^) (12) 

P =i 

where = js J2p=i 4^ ■ Finally, note that step 3 of Algo. [I] requires the simulation of 
variables exactly distributed according to f(z\0). For some distributions in ([TJ this might not 
be possible. In that case, one can resort to a Markov chain Monte Carlo (MCMC) method 
lfT6ll to generate samples approximately distributed according to f(z\0). For instance, most 
MRFs can be efficiently sampled using a Gibbs sampler [T6j, which does not require knowing 
C(0). 

Algorithm 1 Proposed MC algorithm 
l: Input: 6, number of Monte Carlo samples P. 

2: for p = 1 to P do 

3: Generate ~ f(z\0) = ^ exp [0<&(z)] 
4: Set cp {p) = ($ 1 ( Z W),...,$m(2 (p) )) t 

5: end for 



IV. Application to the Ising and Potts Markov Random Fields 

A. Ising and Potts models 

This section presents experimental results conducted to assess the performance of the 
proposed method. The CRB of two important intractable models have been computed, namely 
the Ising and Potts MRFs. In these experiments the simulation of variables z ~ f(z\6) has 
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been approximated by using a Gibbs sampler |fT6ll , which does not require knowing C{0). 
Therefore the proposed method has the advantage of avoiding the need of estimating C{0) 
and computing derivatives numerically. Finally, to simplify their visual interpretation, results 
are displayed using a logarithmic scale. For completeness the Ising and Potts models are 
recalled below. 

Let z = (zi, z 2 , . . . , zn) be a discrete random vector whose elements take their values in 
the finite set Vt K = {1, . . . , K}. The Ising and the Potts MRF are defined as follows 

nm ~ cW) exp[mz)] (13) 

with 

N 

$(Z) = J2 E 6 ( Z n-*n') (14) 
n=l n'eV(ra) 

where V(n) is the index set of the neighbors associated with the nth element, 5(-) is 
the Kronecker function and (3 G IR + is the granularity coefficient or inverse temperature 
parameter. The Gibbs distribution ( [T3| ) corresponds to the Ising MRF when K = 2, and to 
the Potts MRF for K > 3. In our experiments V(n) will be considered to be a bidimensional 
first-order (i.e., 4-pixel) neighborhood structure. However, the proposed method is valid for 
any correct neighborhood structure (see [[Til for more details). 



B. Validation of the proposed method 

To validate the proposed MC method under controlled ground truth conditions (i.e., for a 
known CRB), the algorithm described in paragraph |III-B has been first applied to an Ising 
model defined on a toroidal graph (i.e., with cyclic boundary conditions) of size iV = 32 x 32. 
Unlike most MRF models, the pdf of this particular MRF has a known analytical expression 
IfTTll that leads to the following FIM (6) 



1 



A' 



+ 12iV^ 



z/(/3) 2 A„(/3) _ v"(0)A n (P) 



n=l 



with 



sinh 2 (/3) 

i/(J3) = cosh(/3) 
v"{p) = sinh(/3) + 



and 



A„G0) 



cosh 2 (^) 
sinh(/3) 



A n (/3)t 

cosh(/3) 
~ sinh 2 (/3) 

1 + cosh 2 (/3) 
sinh 3 (/3) 

(2n - 1)tt 



(15) 



cos 



2N 
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A n (/3) = A n (/3) 2 -l. 

Fig. [T] compares the true CRB values computed with ([15]) with the estimates obtained using 
the proposed method. These estimates have been computed from P = 10 000 independent 
MC samples, generated using 10 000 Gibbs moves (these estimates are represented as red 
crosses). In order to illustrate the effect of the number of Gibbs moves used to approximate 
step 3 of Algo. [T] this experiment was repeated using 100 000 Gibbs moves (estimates are 
represented as black circles). To ease visual interpretation, results are displayed using a 
logarithmic scale. We observe that the estimates obtained with the proposed method are in 




Fig. 1. Cramer-Rao bound for an Ising (K — 2) defined on a toroidal graph. Estimates obtained with 10 000 and 100 000 
Gibbs moves are respectively represented as red crosses and black circles. 



good agreement with the true values of the CRB, and that the approximation introduced by 
using a Gibbs sampler within Algo. [T] has not affected the estimation results significantly. 
We also observe that the CRB varies significantly with the value of (3 and that it reaches a 
minimum at approximately (3 = 0.9. Note that the location of this minimum coincides with 
the phase-transition temperature of the Ising MRF {(3 C = log(l + \/2) ~ 0.88). This result is 
in agreement with the fact that the log-partition function \ogC{(3) has its highest derivatives 
around the phase-transition temperature, as noticed in J6]|, [|5]|. Moreover, Fig. [2] illustrates 
the effect of the number of samples P on the CRB estimates for different values of (3. We 
observe that in these experiments the estimates stabilize for P > 1 000. 
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Fig. 2. Estimated CRB (toroidal Ising model) versus the number of MC samples P. 



C. Asymptotic study of the CRB 

The second set of experiments shows the evolution of the CRB with the size of the 
observation vector z (i.e., the number of field components N). The CRB has been computed 
for the following 5 field sizes N = (2 8 , 2 10 , 2 12 , 2 14 , 2 16 ), corresponding to bidimensional 
MRFs of size 16 x 16, 32 x 32, 64 x 64, 128 x 128 and 256 x 256. Experiments have been 
performed using an Ising MRF, a 3-state and a 4-state Potts MRF (i.e., K — 2, K — 3 and 
K = 4 respectively) defined on a plane (not a toroid). CRB estimates have been computed 
using 10 000 Gibbs moves per field component and P = 2 500 samples. Finally, for each 
model, the parameter was set to the critical phase-transition value, i.e., [3 C = log(l + yK) 
to introduce a strong dependency between the components of the MRF. Fig. [3] shows the 
resulting CRBs versus the size of the MRF N in logarithmic scales. 

We observe that for all models the logarithm of the CRB decreases almost linearly with 
the logarithm of the number of field components. This result shows that the strong depen- 
dency between the field components does not modify significantly the linear behavior that 
is generally observed for models defined by statistically independent components. We also 
observe that the CRB decreases with the number of states K, indicating that an accurate 
estimation of (3 for the Ising model is more difficult than for a Potts MRF. 
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D. Evaluation of state-of-the art estimators of (3 

The third set of experiments compares the CRB to the empirical variance of three state-of- 
the art estimation methods, the auxiliary variable (TJ, exchange [2] and ABC algorithms. 
As explained previously, the CRB is often used as a means to measure the performance 
of unbiased estimators in terms of mean square error. In this paper, the three algorithms 
studied in flT), 0, have been used to compute an approximate maximum-likelihood 
(ML) estimation of (3 for the Ising and the 3-state Potts MRF. Experiments were conducted 
as follows. First a random vector z ~ f(z\P) was generated using an appropriate Gibbs 
sampler. After generating z, each estimation method was used to generate 1 000 samples 
approximately distributed according to the intractable likelihood f(z\/3). Finally, for each set 
of samples, an ML estimate was computed by maximizing a non-parametric kernel density 
estimation. This procedure was repeated 2 500 times to compute the variance of the ML 
estimates. All algorithms used 250 burn-in steps and 10 Gibbs moves per auxiliary variable 
coordinate, which are realistic implementation conditions for signal processing applications 
0, lfl8ll . Moreover, the auxiliary variable method JTJ was implemented using the true value 
of (3 as auxiliary estimate, while the tolerance of the ABC method was set to 1% (see 
lfT9ll for more details about these methods and their application to the Ising and Potts MRFs). 
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Fig. |4ja) compares the CRBs computed for the planar Ising model and the variance of the 
ML estimates obtained with the state-of-the art algorithms. These values have been computed 
for (3 < (3 C which is the range of interest in signal processing applications (for (3 > (3 C all the 
field components have almost surely the same color). We observe the good performance of 
ML estimators based on the exchange [2] and the ABC for small values of (3 (i.e., (3 < 0.5). 
However, their performance decreases progressively for f3 > 0.5. This deterioration is possibly 
due to a degradation of the approximate likelihood constructed from the generated samples. 
Moreover, one can see that the ML estimator based on the auxiliary variable method [HI is 
less interesting than the other estimators. This result is in accordance with the experiments 
reported in [0, Q. Furthermore, Fig. |4jb) shows the CRBs computed for a planar 
3-state Potts MRF of size 32 x 32 and (3 < (3 C = log(l + a/3). Again, these values are 
compared to the empirical variance of the state-of-the art methods. We observe that for 
the Potts MRF the CRB also varies significantly with the value of (3 and that it reaches a 
minimum at approximately (3 = 1.0. Again, the location of this minimum coincides with the 
phase-transition temperature f3 c ~ 1.01. Similarly to Fig. |4ja), the ML estimates based on 
the exchange [2] and the ABC Q methods are close to the CRB for small values of (3, and 
depart progressively as (3 increases. 
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Fig. 4. Cramer-Rao bounds for the Ising and Potts models of size 32 x 32. Results are displayed in logarithmic scale. 



V. Conclusion 

This paper studied the problem of computing the CRB for the parameters of Markov ran- 
dom fields. It has been shown that for these distributions the CRB depends on the derivatives 
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of the normalizing constant or partition function C(0), which is generally intractable. An 
original Monte Carlo algorithm was proposed for distributions belonging to the exponential 
family. The derivatives of the partition function C(0) were shown to be related to the second 
order statistical moments of the MRF Gibbs potential. Based on this result, an original 
Monte Carlo method was proposed to compute the Fisher Information matrix of the MRF 
and therefore the CRB. To illustrate the interest of this method, the proposed algorithm 
was successfully applied to the Ising and the Potts models, which are frequently used in 
signal processing applications. The resulting bounds have been used, in turn, to assess the 
statistical efficiency of three state-of-the art estimation methods that are interesting for image 
processing applications. These results revealed that for high temperatures (i.e., low values 
of /3) the estimation methods are close to being statistically efficient. Perspectives for future 
work include the derivation of Bayesian Cramer Rao bounds for intractable models whose 
unknown parameters are assigned prior distributions. 
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